Intrinsic dimension of a dataset: what properties does one expect? 
Abstract: We propose an axiomatic approach to the concept
of an intrinsic dimension of a dataset, based on a viewpoint 
of geometry of high-dimensional structures. Our first axiom 
postulates that high values of dimension be indicative of the 
presence of the curse of dimensionality (in a certain precise 
mathematical sense). The second axiom requires the dimension 
to depend smoothly on a distance between datasets (so that the 
dimension of a dataset and that of an approximating principal 
manifold would be close to each other). The third axiom is 
a normalization condition: the dimension of the Euclidean 
n-sphere $\mathbb{S}^n$ is $\Theta(n)$. We give an example of a dimension
function satisfying our axioms, even though it is in general
computationally infeasible, and discuss a computationally cheap
function satisfying most but not all of our axioms (the "intrinsic
dimensionality" of Chavez et al.).

I. Introduction 

A search for the "right" concept of intrinsic dimension 
of a dataset is not yet over, and most probably one will 
have to settle for a spectrum of various dimensions, each 
serving a particular purpose, complementing each other. (Cf.
[?] and references therein.) At
the same time, it is quite clear that the word "dimension" 
has a rather specific meaning in this context. High values 
of dimension are invariably associated with the curse of 
dimensionality, while the low values are expected to contain 
useful information, for instance, about a non-linear manifold 
approximating the dataset. Is it too much to expect of a 
dimension function? 

Here we are trying to address the problem of existence 
of dimension functions making sense for all datasets and 
satisfying the above two requirements, within the constraints
of a certain mathematical model. Datasets are modelled by
spaces $(X, d, \mu)$ equipped with a distance $d$ and a probability
distribution $\mu$, while features of datasets correspond to
1-Lipschitz (non-expanding) functions $f$ on $X$. The curse of
dimensionality describes a situation where the features are
sharply concentrated around their means. In geometric terms,
one speaks here of the phenomenon of concentration of measure
on high-dimensional structures [12]. This phenomenon
admits well-understood quantitative measures [10], [?], [?],
which enable us to express in precise mathematical terms
the following condition on an intrinsic dimension function:
high values of dimension are indicative of the presence of 
the curse of dimensionality. 

Geometry of high dimensions (asymptotic geometric anal- 
ysis) has in store a concept of a distance between spaces 
with metric and measure, X and Y, which, in our view, 
could, in one form or another, eventually become very
useful in principal manifold theory. We describe this notion,
due to Gromov [?], and state the second axiom: if the
Gromov distance between two spaces is small, their intrinsic 
dimensions should be close to each other. 

The third axiom serves a normalization purpose by stating 
that the intrinsic dimension of the Euclidean sphere $\mathbb{S}^n$
should be on the order of $n$.

Paradoxically, any dimension function of the suggested
kind always assigns to a singleton the value $+\infty$; however,
this does not lead to any problems or contradictions.

We give an example of a dimension function satisfying
the axioms, and compute its values for the spheres $\mathbb{S}^n$. In
general, however, this function is computationally infeasible.
We discuss in this connection the "intrinsic dimensionality"
of Chavez et al., which is easy to compute and already has uses
in data engineering [3], and which satisfies some, but not all, of
our axioms.

II. Preliminaries 

A. Metric spaces with measure as models for datasets 

A geometric model for a dataset [11], [12] is a metric
space with measure [10], [?], that is, a triple $(X, d, \mu)$, where
$X$ is a set equipped with a metric $d$ and a probability
measure $\mu$. Sometimes $\mu$ is thought of as an underlying
distribution for the actual set of data; otherwise one can associate
to $X$ the normalized counting measure $\mu(A) = \sharp(A)/\sharp(X)$.

In some situations, especially in sequence-based biology, 
a metric d has to be replaced with a more general similarity 
measure between datapoints, such as a quasimetric [13].

B. 1-Lipschitz functions as models for features 

Features of datasets correspond in the above setting to 
functions $f$ on $X$ taking values in the real numbers, the
Euclidean space, or another target space (such as a discrete
set). The features are assumed to depend smoothly on the
distance between datapoints. After a suitable normalization,
one can usually assume such a function, $f$, to be 1-Lipschitz:
for all $x, y \in X$, one has

$$|f(x) - f(y)| \le d(x, y).$$

The features are in a sense the "observable quantities" of a 
dataset. 
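
As a concrete illustration of the inequality above, the distance to a fixed point is always a 1-Lipschitz feature (by the triangle inequality), whereas a rescaled copy of it need not be. The following is a minimal sketch, assuming Euclidean data; the helper names `distance_feature` and `is_one_lipschitz` are hypothetical and introduced only for this illustration.

```python
import numpy as np

def distance_feature(points, anchor):
    """Feature f(x) = d(x, anchor) for the Euclidean metric."""
    return np.linalg.norm(points - anchor, axis=1)

def is_one_lipschitz(values, points, tol=1e-9):
    """Check |f(x) - f(y)| <= d(x, y) over all pairs of sample points."""
    diffs = np.abs(values[:, None] - values[None, :])
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    return bool(np.all(diffs <= dists + tol))

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))      # illustrative dataset (an assumption)
f = distance_feature(X, X[0])      # distance to the first datapoint
g = 3.0 * f                        # rescaled feature, no longer 1-Lipschitz
print(is_one_lipschitz(f, X), is_one_lipschitz(g, X))
```

The check is only over the sampled pairs, so it can refute the 1-Lipschitz property on a sample but cannot prove it for the whole space.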

C. Observable diameter and concentration phenomenon 

The curse of dimensionality is a name given to the 
situation where all or some of the important features of 
a dataset sharply concentrate near their median (or mean) 
values and thus become non-discriminating. In such cases, 
$X$ is perceived as intrinsically high-dimensional. This set
of circumstances covers a whole range of well-known
high-dimensional phenomena, such as sparseness of
points (the distance to the nearest neighbour is comparable
to the average distance between two points [?]). It
has been argued in [12] that a mathematical counterpart of
the curse of dimensionality is the well-known concentration
phenomenon [9], [?], which can be expressed, for instance,
using Gromov's concept of the observable diameter [?].

Let $(X, d, \mu)$ be a metric space with measure, and let $\kappa > 0$
be a small fixed threshold value. The observable diameter
of $X$ is the smallest real number $D = \mathrm{ObsDiam}_\kappa(X)$ with
the following property: for every two points $x, y$ randomly
drawn from $X$ with regard to the measure $\mu$, and for any
given 1-Lipschitz function $f\colon X \to \mathbb{R}$ (a feature), the
probability of the event that the values of $f$ at $x$ and $y$ differ
by more than $D$ is below the threshold value:

$$P\bigl[\,|f(x) - f(y)| > D\,\bigr] < \kappa.$$

Informally, the observable diameter $\mathrm{ObsDiam}_\kappa(X)$ is the
size of a dataset $X$ as perceived by us through a series
of randomized measurements using arbitrary features and
continuing until the probability of improving on the previous
observation gets too small. The observable diameter has little
(logarithmic) sensitivity to $\kappa$.

The characteristic size $\mathrm{CharSize}(X)$ of $X$ is defined as the median
value of distances between two elements of $X$. The concentration
of measure phenomenon refers to the observation that
"natural" families of geometric objects $(X_n)$ often satisfy

$$\mathrm{ObsDiam}_\kappa(X_n) \ll \mathrm{CharSize}(X_n) \quad\text{as } n \to \infty.$$

A family of spaces with metric and measure having the above 
property is called a Levy family. Here the parameter n usually 
corresponds to dimension of an object defined in one or 
another sense. 

For the Euclidean spheres $\mathbb{S}^n$ of unit radius, equipped
with the usual Euclidean distance and the (unique)
rotation-invariant probability measure, one has $\mathrm{CharSize}(\mathbb{S}^n) \to \sqrt{2}$,
while $\mathrm{ObsDiam}(\mathbb{S}^n) = O(1/\sqrt{n})$. Fig. 1 shows observable
diameters (indicated by inner circles) corresponding to the
threshold value $\kappa = 10^{-10}$ of spheres $\mathbb{S}^n$ in dimensions
$n = 3, 10, 100, 2500$, along with projections to the
two-dimensional screen of 1000 randomly sampled points.
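
The behaviour described for spheres can be observed numerically. The sketch below samples points uniformly from $\mathbb{S}^n$ (by normalizing Gaussian vectors), estimates the characteristic size as the median pairwise distance, and reports the spread of one particular 1-Lipschitz feature, the first coordinate. The spread of a single feature is only a crude proxy and not the observable diameter itself, which involves all 1-Lipschitz functions; the function names and the 5%-95% quantile range are assumptions made for this illustration.

```python
import numpy as np

def sample_sphere(n, size, rng):
    """Uniform sample from the unit sphere S^n, viewed inside R^(n+1)."""
    g = rng.normal(size=(size, n + 1))
    return g / np.linalg.norm(g, axis=1, keepdims=True)

def sphere_stats(n, size, rng):
    """Return (median pairwise distance, 5%-95% spread of the first coordinate)."""
    x = sample_sphere(n, size, rng)
    y = sample_sphere(n, size, rng)
    char_size = float(np.median(np.linalg.norm(x - y, axis=1)))
    f = x[:, 0]                      # coordinate projection, a 1-Lipschitz feature
    spread = float(np.quantile(f, 0.95) - np.quantile(f, 0.05))
    return char_size, spread

rng = np.random.default_rng(0)
for n in (3, 10, 100, 2500):
    char_size, spread = sphere_stats(n, 1000, rng)
    print(n, round(char_size, 3), round(spread, 3))
```

As $n$ grows, the first column of estimates should settle near $\sqrt{2}$ while the spread of the feature shrinks roughly like $1/\sqrt{n}$.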

Other important examples of Levy families [10], [?]
include: (i) Hamming cubes $\{0, 1\}^n$ of binary $n$-strings
equipped with the normalized Hamming distance
$d(\sigma, \tau) = \frac{1}{n}\,\sharp\{i : \sigma_i \neq \tau_i\}$ and the normalized
counting measure; (ii) groups $SU(n)$
of special unitary $n \times n$ matrices, with the geodesic distance
and Haar measure (the unique invariant probability measure);
(iii) any family of expander graphs ([?], p. 197) with the
normalized counting measure on the set of vertices and the
path metric.

Any dataset whose observable diameter is small relative to
the characteristic size suffers from the curse of
dimensionality.




Fig. 1. Observable diameter of the sphere $\mathbb{S}^n$, $n = 3, 10, 100, 2500$.



D. Concentration function 

A convenient way to quantify the concentration
phenomenon is provided by the concentration function, $\alpha(\varepsilon)$, of
a space $(X, d, \mu)$ [10], [?]. Here is a definition in terms of
features (1-Lipschitz functions). Denote by $M_f$ the median
value of a function $f$, that is, a number such that

$$\mu\{x \in X : f(x) \ge M_f\} \ge \tfrac{1}{2} \quad\text{and}\quad \mu\{x \in X : f(x) \le M_f\} \ge \tfrac{1}{2}.$$

Now set $\alpha(0) = \tfrac{1}{2}$, and for every $\varepsilon > 0$

$$\alpha(\varepsilon) = \sup_f\, \mu\{x \in X : f(x) \ge M_f + \varepsilon\}, \tag{1}$$

where the supremum is taken over all 1-Lipschitz real-valued
functions $f$ on $X$. Thus, the value $\alpha(\varepsilon)$ of the concentration
function gives an upper bound on the probability of a large
deviation of any feature from its median. Equivalently,

$$\alpha(\varepsilon) = 1 - \inf_A\, \mu(A_\varepsilon),$$

where $A_\varepsilon$ denotes the $\varepsilon$-neighbourhood of $A$ in $X$ (the set of
all $x \in X$ at a distance $< \varepsilon$ to some point in $A$), and the infimum
is taken over all subsets $A \subseteq X$ satisfying $\mu(A) \ge \tfrac{1}{2}$.

A family $(X_n)$ of spaces with metric and measure is a
Levy family as defined in Subsection II-C if and only if
the values of concentration functions $\alpha_{X_n}(\varepsilon)$ converge to
zero pointwise for every $\varepsilon > 0$. Concentration functions of
spheres in various dimensions are shown in Fig. 2.
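
For a very small finite space with the normalized counting measure, the second formula, $\alpha(\varepsilon) = 1 - \inf_A \mu(A_\varepsilon)$, can be evaluated by brute force. The sketch below is illustrative only: the function name `concentration_function` is hypothetical, the input is assumed to be a full distance matrix, and the cost grows combinatorially with the number of points, so it is usable for roughly a dozen points at most.

```python
from itertools import combinations
import numpy as np

def concentration_function(dist, eps):
    """alpha(eps) = 1 - min mu(A_eps) over subsets A with mu(A) >= 1/2,
    for a finite space with the normalized counting measure."""
    if eps <= 0:
        return 0.5                            # convention alpha(0) = 1/2
    n = dist.shape[0]
    k_min = (n + 1) // 2                      # smallest |A| with |A|/n >= 1/2
    best = 1.0
    # A_eps grows with A, so the minimum is attained on sets of size k_min.
    for subset in combinations(range(n), k_min):
        near = (dist[:, list(subset)] < eps).any(axis=1)   # open eps-neighbourhood
        best = min(best, float(near.mean()))
    return 1.0 - best

# Four points at mutual distance 1.
D4 = 1.0 - np.eye(4)
print([concentration_function(D4, e) for e in (0.5, 1.0, 1.5)])
```

For this four-point space the function takes the value 1/2 as long as no neighbourhood reaches another point, and drops to 0 once $\varepsilon$ exceeds the common distance.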

E. Gromov distance 

We proceed to describe a distance between spaces with
metric and measure as introduced by Gromov [?], p. 200.

Recall that the Hausdorff distance between two subsets $A$
and $B$ of a metric space $(X, d)$ is the infimum of all $\varepsilon > 0$ with
the property

$$A \subseteq B_\varepsilon \quad\text{and}\quad B \subseteq A_\varepsilon.$$



(The $\varepsilon$-neighbourhood, $A_\varepsilon$, of $A$ was defined above in II-D.)

Fig. 2. Concentration functions of $n$-spheres for various $n$.

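
For finite subsets of a Euclidean space, the Hausdorff distance recalled above reduces to a max-min computation over pairwise distances. A minimal sketch, with the function name `hausdorff` and the sample point sets chosen purely for illustration:

```python
import numpy as np

def hausdorff(A, B):
    """Hausdorff distance between finite point sets A, B (rows are points)."""
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)
    # Every point of A must be within eps of B, and vice versa.
    return float(max(d.min(axis=1).max(), d.min(axis=0).max()))

A = np.array([[0.0, 0.0], [1.0, 0.0]])
B = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 2.0]])
print(hausdorff(A, B))   # the point (1, 2) of B is at distance 2 from A
```
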
Let $(X, d_X, \mu_X)$ and $(Y, d_Y, \mu_Y)$ be two spaces with
metric and measure. Denote by $\mathcal{L}ip_1(X)$ and $\mathcal{L}ip_1(Y)$ the spaces
of 1-Lipschitz real-valued functions (i.e., features) on $X$ and
on $Y$, respectively. Informally, the Gromov distance between
$X$ and $Y$ is the Hausdorff distance between $\mathcal{L}ip_1(X)$ and
$\mathcal{L}ip_1(Y)$. Of course, in order to measure it, one needs to
"pull back" all the functions to a common third space.

This space is the function space on the unit interval $[0, 1]$.
It is a standard result in measure theory that every measure
space $(X, \mu)$ (under mild restrictions met e.g. by every space
with metric and measure) admits a parametrization, that is,
a mapping $\phi\colon [0, 1] \to X$ with the property: for all $A \subseteq X$,
$\mu(A)$ equals the Lebesgue measure of $\phi^{-1}(A)$. For instance,
if $X$ is a finite set with the normalized counting measure,
then $\phi$ would be a function taking a constant value $x \in X$
on each of $n = \sharp(X)$ intervals of equal measure.

Choose parametrizations $\phi$ for $X$ and $\psi$ for $Y$, and denote by
$\phi^*\mathcal{L}ip_1(X)$ the set of all functions of the form $f \circ \phi$,
$f \in \mathcal{L}ip_1(X)$, and similarly $\psi^*\mathcal{L}ip_1(Y)$. Both $\phi^*\mathcal{L}ip_1(X)$ and
$\psi^*\mathcal{L}ip_1(Y)$ are subsets of the space $L^1(0, 1)$ of integrable
functions on the unit interval. Equip the latter space with the
following metric, determining the convergence in measure
(here $\mu$ denotes the Lebesgue measure on $[0, 1]$):

$$\mathrm{me}_1(f, g) = \inf\{\varepsilon > 0 : \mu\{x : |f(x) - g(x)| > \varepsilon\} < \varepsilon\}.$$

Now the Gromov distance $d_{conc}(X, Y)$ is the infimum of
Hausdorff distances between the subsets $\phi^*\mathcal{L}ip_1(X)$ and
$\psi^*\mathcal{L}ip_1(Y)$, taken over all possible parametrizations $\phi$ and
$\psi$. Fig. 3 illustrates the concept.
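
The metric $\mathrm{me}_1$ is easy to evaluate for step functions. The sketch below assumes that $f$ and $g$ are constant on $N$ cells of $[0, 1]$ of equal measure $1/N$ (as happens for pulled-back features of a finite dataset with the counting measure); the function name `me1` is hypothetical. It uses the observation that, if $h_{(1)} \ge h_{(2)} \ge \dots$ are the sorted values of $|f - g|$, then every $\varepsilon > \max(h_{(k+1)}, k/N)$ is admissible, so the infimum is the smallest of these maxima.

```python
import numpy as np

def me1(f_vals, g_vals):
    """Discretized me_1(f, g) = inf{eps > 0 : mu{|f - g| > eps} < eps},
    for f, g constant on N cells of [0, 1] of equal measure 1/N."""
    h = np.sort(np.abs(np.asarray(f_vals, float) - np.asarray(g_vals, float)))[::-1]
    n = h.size
    h_next = np.append(h, 0.0)    # h_next[k] is the (k+1)-th largest value, 0 past the end
    k = np.arange(n + 1)
    # For eps > max(h_next[k], k/N) at most k cells exceed eps: measure <= k/N < eps.
    return float(np.min(np.maximum(h_next, k / n)))

half = np.r_[np.ones(5), np.zeros(5)]   # differs from 0 by 1 on half of [0, 1]
print(me1(half, np.zeros(10)))
```

Computing the Gromov distance itself would additionally require optimizing over all parametrizations and all pairs of 1-Lipschitz functions, which this sketch does not attempt.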

Theorem 1 (Gromov): A family $(X_n)$ of spaces with
metric and measure is a Levy family if and only if $X_n$ converges
in the distance $d_{conc}$ to a singleton $\{*\}$.

The closer a dataset $X$ is to a singleton $\{*\}$ in Gromov's
distance, the higher its intrinsic dimensionality is and the
more it resembles a "black hole" from the viewpoint of data
analysis, because all the features simultaneously become less
and less discriminating. This reflects the fact that on a space
of high intrinsic dimension the features are $\varepsilon$-constant on a
set of measure $\ge 1 - 2\alpha_X(\varepsilon)$, which is close to 1 already for
small values of $\varepsilon > 0$. Consequently, the Hausdorff distance
between $\mathcal{L}ip_1(X)$ and the set of functions on $\{*\}$ (that is,
constant functions) is close to zero.

Fig. 3. To the concept of Gromov's distance.

III. Main results 

A. Axiomatic approach to intrinsic dimension 

Let $\partial$ be a function assigning to every space $(X, d, \mu)$ with
metric and measure either a non-negative real number or the
symbol $+\infty$. We will say that $\partial$ is an intrinsic dimension
function if it satisfies the following three axioms.

1) Axiom of concentration: For a family $(X_n)$ of spaces
with metric and measure, $\partial(X_n) \uparrow \infty$ if and only if $(X_n)$
forms a Levy family.

This axiom formalizes a requirement that the intrinsic
dimension is high if and only if a dataset suffers from the
curse of dimensionality.

2) Axiom of smooth dependence on datasets: If
$d_{conc}(X_n, X) \to 0$, then $\partial(X_n) \to \partial(X)$.

This axiom is necessary to assure that if a dataset $X$ is
well-approximated by a non-linear manifold $M$, then the
intrinsic dimension of $X$ is close to that of $M$.

3) Axiom of normalization: $\partial(\mathbb{S}^n) = \Theta(n)$.¹

This axiom serves to properly calibrate the values of the 
intrinsic dimension. 

Remark 2: Instead of spheres, one can use normalized
hypercubes, Hamming cubes, Euclidean spaces with standard
Gaussian distribution, etc.; it can be proved that each of
these families results in an equivalent definition.

The axioms immediately lead to a paradoxical conclusion.
Since the Euclidean spheres $\mathbb{S}^n$ of radius one with the
rotation-invariant probability measure form a Levy family
[10], [5], they converge to a singleton $\{*\}$ with regard to
Gromov's distance, and Axioms 1 and 2 (or 2+3) imply that

$$\partial(\{*\}) = +\infty.$$

¹ Recall that $f(n) = \Theta(g(n))$ if there exist constants $0 < c < C$ and
an $N$ with $c|f(n)| \le |g(n)| \le C|f(n)|$ for all $n \ge N$. One says that the
functions $f$ and $g$ asymptotically have the same order of magnitude.

The converse is also true. Let $(X, d, \mu)$ be a space with
metric and measure such that the support of $\mu$ is all of $X$.

Theorem 3: Let $\partial$ be an intrinsic dimension function.
Then $\partial(X) = +\infty$ if and only if $X$ is a singleton: $X = \{*\}$.

Proof: If $\partial(X) = +\infty$, then by Axiom 1 the constant sequence
$X_n = X$ is a Levy family, and so $\mathrm{ObsDiam}(X) = 0$.
This is only possible when $\mu$ is Dirac's point mass. ■

Thus, the one and only infinite-dimensional object in the
theory is a single point! This paradox seems to be unavoidable
if one wants a notion of intrinsic dimension capable of
detecting the curse of dimensionality; however, it does not
seem to lead to any problems or inconveniences.

Perhaps even more surprising is the fact that a dimension 
function satisfying the above requirements actually exists. 

B. Example: concentration dimension 

For a space with metric and measure $(X, d, \mu)$, define

$$\dim_\alpha(X) = \frac{1}{\left[2\int_0^1 \alpha_X(\varepsilon)\,d\varepsilon\right]^2}. \tag{2}$$

We call $\dim_\alpha(X)$ the concentration dimension of $X$.
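
Once a concentration function is available, the concentration dimension $1/\bigl[2\int_0^1 \alpha_X(\varepsilon)\,d\varepsilon\bigr]^2$ is a one-dimensional integral. The sketch below evaluates it with the trapezoid rule for two profiles: the constant profile $\alpha = 1/2$ on $[0, 1]$, and a Gaussian-type profile $\tfrac12\exp(-n\varepsilon^2/2)$ whose constants are assumptions chosen only to illustrate linear growth in $n$; they are not the actual concentration functions of spheres. The name `concentration_dimension` is hypothetical.

```python
import numpy as np

def concentration_dimension(alpha, upper=1.0, num=100_001):
    """dim_alpha(X) = 1 / (2 * integral_0^upper alpha(eps) d eps)^2."""
    eps = np.linspace(0.0, upper, num)
    vals = np.array([alpha(e) for e in eps])
    integral = np.sum((vals[1:] + vals[:-1]) * np.diff(eps)) / 2.0   # trapezoid rule
    return 1.0 / (2.0 * integral) ** 2

def two_clusters(e):
    """Constant profile alpha = 1/2 on [0, 1]."""
    return 0.5

def gaussian_like(n):
    """Assumed Gaussian-type profile; the constants are illustrative only."""
    return lambda e: 0.5 * np.exp(-n * e ** 2 / 2.0)

print(concentration_dimension(two_clusters))
for n in (10, 100, 1000):
    print(n, concentration_dimension(gaussian_like(n)))
```

For the assumed Gaussian-type profile the result grows proportionally to $n$ (close to $2n/\pi$ for large $n$), which is the kind of behaviour the normalization axiom asks for.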

Theorem 4: The function $\dim_\alpha$ is an intrinsic dimension
function.

Proof: Axiom 1 follows at once from a standard
result in Real Analysis (Lebesgue's Dominated Convergence
Theorem). Axiom 2 involves a geometrical argument, to be
published elsewhere. Axiom 3 is based on results obtained
decades ago by Paul Levy [8] (cf. also [11], [?]). The
inequality² $\dim_\alpha(\mathbb{S}^n) = \Omega(n)$ follows from a standard
Gaussian upper bound on the concentration function of the
sphere [10],

$$\alpha_{\mathbb{S}^n}(\varepsilon) \le C_1 \exp(-C_2 \varepsilon^2 n).$$

On the other hand, the value of the concentration function $\alpha_{\mathbb{S}^n}(\varepsilon)$
is the relative $n$-volume of a spherical cap of height $1 - \varepsilon$,
and Levy's calculations show that in order for a spherical cap
to keep a constant relative volume as $n \to \infty$, the height of
such a cap should be on the order of $1 - \Theta(1/\sqrt{n})$. This
suffices to obtain the other inequality: $\dim_\alpha(\mathbb{S}^n) = O(n)$. ■
Remark 5: One can replace 1 with any fixed real number
$L > 0$ as the upper limit of integration in Eq. (2). It would
be more natural to integrate to $+\infty$ and set

$$\dim_\alpha(X) = \frac{1}{\left[2\int_0^{\infty} \alpha_X(\varepsilon)\,d\varepsilon\right]^2}; \tag{3}$$

however, Axiom 1 will then no longer hold. Let $X = [1, +\infty)$ be a
semi-infinite interval with the usual distance $d(x, y) = |x - y|$
and probability density $p(x) = 1/x^2$. Taking the 1-Lipschitz
feature $f(x) = x$, whose median is 2, one has

$$\alpha_X(\varepsilon) \ge \mu\{x \in X : x \ge 2 + \varepsilon\} = \frac{1}{2 + \varepsilon},$$

so $\int_0^\infty \alpha_X(\varepsilon)\,d\varepsilon$ diverges to infinity. The concentration
dimension of such a space in the sense of Eq. (3) is zero.
One can modify this example and obtain a Levy family
of spaces with vanishing concentration dimension. Still, for
all practical purposes it is more convenient to assume the
definition in Eq. (3) and restrict it to spaces with integrable
concentration function (including, for instance, all spaces
with bounded metric).

² Recall that $f(n) = \Omega(g(n))$ if for a constant $C > 0$ and a natural $N$
one has $|f(n)| \ge C|g(n)|$ for all $n \ge N$. It is easy to see that the condition
$f = \Theta(g)$ is equivalent to the conjunction of $f = O(g)$ and $f = \Omega(g)$.

Even if the concept of concentration dimension is intro- 
duced here for the first time, some known results can be 
reformulated in such a way as to underscore its theoretical 
relevance. Particular instances of the following theorem are 
well-known and often used, although in a different disguise 
(cf. [10], p. 60), so we leave the proof out.

Theorem 6: The median and the mean of a 1-Lipschitz
function $f$ on a space $(X, d, \mu)$ differ
by at most $1/\sqrt{\dim_\alpha(X)}$ (in the sense of Eq. (3)). ■

Euclidean spheres $\mathbb{S}^n$ of unit radius are among very few
concrete families of geometric objects for which the exact
values of $\dim_\alpha$ can be computed. (Cf. Fig. 4.)

Fig. 4. Concentration dimension of $n$-spheres for all $2 \le n \le 101$.



Example 7: Let

$$\mathbb{S}^{n-1}_i = \{x \in \mathbb{R}^{n+1} : x_1 = i,\ x_2^2 + x_3^2 + \dots + x_{n+1}^2 = 1\},$$

where $i = 0, 1$, be two copies of the unit sphere $\mathbb{S}^{n-1}$ sitting
inside $\mathbb{R}^{n+1}$ at a distance 1 from and parallel to each other.
Consider their union

$$X_n = \mathbb{S}^{n-1}_0 \cup \mathbb{S}^{n-1}_1.$$

(Cf. Fig. 5.)

Fig. 5. The space $X_n$ from Example 7 (the figure indicates the first coordinate projection).

Equip $X_n$ with the Euclidean distance coming from $\mathbb{R}^{n+1}$
and define a probability measure $\mu$ as follows:

$$\mu(A) = \tfrac{1}{2}\,\mu^{(n-1)}(A \cap \mathbb{S}^{n-1}_0) + \tfrac{1}{2}\,\mu^{(n-1)}(A \cap \mathbb{S}^{n-1}_1).$$

(Here $\mu^{(n-1)}$ is the rotation-invariant probability measure on $\mathbb{S}^{n-1}$.)

Among all subsets $A$ of measure $\ge \tfrac{1}{2}$, those whose
$\varepsilon$-neighbourhoods have the smallest measure are exactly the
spheres $\mathbb{S}^{n-1}_i$, $i = 0, 1$, which form two well-separated
clusters inside $X_n$. The concentration function of $X_n$ satisfies

$$\alpha_{X_n}(\varepsilon) = \begin{cases} \tfrac{1}{2}, & \text{if } 0 \le \varepsilon \le 1, \\ 0, & \text{otherwise,} \end{cases}$$

and $\dim_\alpha(X_n) = 1$ for all $n$, another type of paradoxical
behaviour!

This agrees with the fact that the sphere $\mathbb{S}^{n-1}$ of high
dimension is close (in the Gromov distance) to a singleton,
and therefore $X_n$ is close to the two-point space $\{0, 1\}$.
A low value of the concentration dimension indicates the
existence of a well-separating feature: the first coordinate
projection $X_n \to \{0, 1\}$.

C. The intrinsic dimensionality of Chavez et al. 

The following interesting version of intrinsic dimension 
was proposed by Chavez et al. who called it simply 
intrinsic dimensionality. The concept explores a well-known 
property of high-dimensional spaces: the values of distances 
between points are sharply concentrated near one value (the 
characteristic size of $X$), cf. Fig. 6.

Let $(X, d, \mu)$ be a space with metric and measure. Denote
by $m(d)$ the mean of the distance function $d\colon X \times X \to
\mathbb{R}$ on the space $X \times X$ with the product measure. Assume
$m(d) < \infty$. (This is not always the case: consider the space
from Remark 5.) Let $\sigma(d)$ be the standard deviation of the
same function. The intrinsic dimensionality of $X$ is defined
as

$$\dim_{dist}(X) = \frac{m(d)^2}{2\sigma^2(d)}. \tag{4}$$
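
Unlike the concentration dimension, the quantity $m(d)^2 / (2\sigma^2(d))$ is straightforward to estimate from a sample of pairwise distances. The following is a minimal Monte Carlo sketch, assuming a Euclidean dataset stored as rows of an array; the function name `intrinsic_dimensionality`, the sample sizes, and the choice of uniform points in the unit cube are illustrative assumptions.

```python
import numpy as np

def intrinsic_dimensionality(points, num_pairs, rng):
    """Monte Carlo estimate of m(d)^2 / (2 sigma^2(d)) from random pairs."""
    i = rng.integers(0, len(points), size=num_pairs)
    j = rng.integers(0, len(points), size=num_pairs)   # independent draws: product measure
    d = np.linalg.norm(points[i] - points[j], axis=1)
    return float(d.mean() ** 2 / (2.0 * d.var()))

rng = np.random.default_rng(0)
for n in (3, 10, 100):
    cube = rng.random(size=(2000, n))                  # uniform sample from [0, 1]^n
    print(n, round(intrinsic_dimensionality(cube, 10_000, rng), 1))
```

The estimate grows with the dimension of the cube, reflecting the increasingly sharp concentration of pairwise distances around their mean.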



Theorem 8: The intrinsic dimensionality of Chavez et al.
satisfies:

• a weaker version of Axiom 1: if $(X_n, d_n, \mu_n)$ is a
Levy family of spaces with bounded metrics, then
$\dim_{dist}(X_n) \to \infty$,

• a weaker version of Axiom 2: if $d_{conc}(X_n, X) \to 0$
and $m(d_n) \to m(d)$, then $\dim_{dist}(X_n) \to \dim_{dist}(X)$,

• Axiom 3.

Proof: For the first property, notice that if $(X_n)$ is a
Levy family, then so is $(X_n \times X_n)$, and the distance function
$d_n$ concentrates near its median value, which can be replaced
with the mean value by Theorem 6.

The second property follows immediately from a similar
property of the concentration dimension, while the proof of
Axiom 3 uses symmetries of the sphere and is similar to the
proof of Axiom 3 for the concentration dimension. ■

Fig. 6. Distribution of distances between randomly chosen pairs of
points in the unit hypercube $\mathbb{I}^n$, $n = 3, 10, 100, 1000$. (Each
histogram is based on a random sample of 10,000 pairs.)

Remark 9: For a singleton, Eq. (4) returns $\frac{0}{0}$, and this value
is genuinely undefined. Indeed, denote by $\varepsilon X_N$ a space with
$N$ points at a distance of $\varepsilon$ from each other, equipped with
the normalized counting measure. It is easy to see that

$$\dim_{dist}(\varepsilon X_N) = \Theta(N) \to +\infty \quad\text{as } N \to \infty.$$

When $\varepsilon \to 0$, each of the spaces $\varepsilon X_N$ converges to a
singleton in Gromov's distance, and so one cannot assign
any particular value to the intrinsic dimension $\dim_{dist}(\{*\})$.

This difference in behaviour is due to the fact that the
intrinsic dimensionality is not an exact analogue of our
concentration dimension, but rather of its normalized analogue
$\dim_\alpha(X) \times \mathrm{CharSize}(X)^2$.
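
The order of growth stated in Remark 9 can be checked by hand. The short derivation below assumes that the pair $(x, y)$ is drawn from the product measure, so that $x = y$ with probability $1/N$ and the distance equals $\varepsilon$ otherwise:

```latex
\begin{align*}
m(d) &= \varepsilon\Bigl(1 - \tfrac{1}{N}\Bigr), \\
\sigma^2(d) &= \mathbb{E}[d^2] - m(d)^2
  = \varepsilon^2\Bigl(1 - \tfrac{1}{N}\Bigr) - \varepsilon^2\Bigl(1 - \tfrac{1}{N}\Bigr)^2
  = \varepsilon^2\,\tfrac{1}{N}\Bigl(1 - \tfrac{1}{N}\Bigr), \\
\frac{m(d)^2}{2\sigma^2(d)}
  &= \frac{(1 - 1/N)^2}{(2/N)(1 - 1/N)} = \frac{N - 1}{2} = \Theta(N).
\end{align*}
```

The value does not depend on $\varepsilon$, and for $N = 1$ both the numerator and the denominator vanish, which is the undefined expression mentioned for a singleton.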

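Example 10: The concentration function of the space
$X_N = 1 \cdot X_N$ as above is easy to compute:

$$\alpha_{X_N}(\varepsilon) = \begin{cases} \tfrac{1}{2}, & \text{if } \varepsilon \le 1, \\ 0, & \text{if } \varepsilon > 1, \end{cases}$$

and so $\dim_\alpha(X_N) = 1$ for every $N$. At the same time,
$\dim_{dist}(X_N) \to \infty$, even as $\mathrm{CharSize}(X_N) = \Theta(1)$.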

One can argue that in Example 10 the intrinsic dimensionality
of Chavez et al. gives away more useful information
than the concentration dimension, because the spaces $X_N$
are often used to illustrate the curse of dimensionality in the
context of similarity search as a toy example [1]. This case,
which may or may not qualify as a genuine specimen of the
"curse of dimensionality" (finding nearest neighbours
is easy; it is just outputting them all that is expensive), is
indeed missed by our approach.

Example 11: The intrinsic dimensionality of the spaces
$X_n$ from Example 7 (cf. Fig. 5) is uniformly bounded over
all $n$. Indeed, the mean distance between two random points
$x, y \in X_n$ goes to $\sqrt{2}$ as $n \to \infty$ provided $x, y$ are from the
same sphere, and to $\sqrt{3}$ otherwise. Since the two events are
equiprobable, $m(d) \to (\sqrt{3} + \sqrt{2})/2$. Similarly, $\sigma^2(d) \to
(\sqrt{3} - \sqrt{2})^2/4$, and

$$\dim_{dist}(X_n) \to \left(\frac{\sqrt{3} + \sqrt{2}}{\sqrt{3} - \sqrt{2}}\right)^{2} \approx 97.99 \quad\text{as } n \to \infty.$$
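
The two limits used in Example 11, for the mean and the variance of the distance, can be checked by simulation. The sketch below samples the two-sphere space directly (first coordinate 0 or 1 with equal probability, remaining coordinates uniform on the unit sphere) and compares the sample mean and variance of pairwise distances with the stated limits. The function name `sample_two_spheres` and the values of `n` and `pairs` are assumptions made for this illustration, and at finite $n$ the agreement is only approximate.

```python
import numpy as np

def sample_two_spheres(n, size, rng):
    """Sample X_n: first coordinate i in {0, 1} (equiprobable), the remaining
    n coordinates uniform on the unit sphere S^(n-1)."""
    g = rng.normal(size=(size, n))
    u = g / np.linalg.norm(g, axis=1, keepdims=True)
    i = rng.integers(0, 2, size=(size, 1)).astype(float)
    return np.hstack([i, u])

rng = np.random.default_rng(0)
n, pairs = 500, 4000
x = sample_two_spheres(n, pairs, rng)
y = sample_two_spheres(n, pairs, rng)
d = np.linalg.norm(x - y, axis=1)
print(d.mean(), (np.sqrt(3) + np.sqrt(2)) / 2)        # sample mean vs. limit
print(d.var(), (np.sqrt(3) - np.sqrt(2)) ** 2 / 4)    # sample variance vs. limit
```
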



TABLE I
Estimates of intrinsic dimensionality of spaces $X_n$ from Ex. 11

n                2      3      10     30     100    1000   5000
dim_dist(X_n)    6.7    11.2   34.0   61.7   83.5   96.3   97.7

See Table I for estimates of $\dim_{dist}(X_n)$ for selected
values of $n$, based on the distance distribution of $3 \cdot 10^5$ randomly
sampled pairs (elements of $X_n \times X_n$). Keep in mind
that the topological dimension of $X_n$ is $n - 1$, while the
concentration dimension is 1.

D. Some other approaches to intrinsic dimension

The approaches to intrinsic dimension listed below are
all quite different both from our approach and from that of
Chavez et al., in that they are set to emulate various versions
of topological (i.e. essentially local) dimension. All of them
fail both our Axioms 1 and 2 and satisfy $\dim(X_n) = \Theta(n)$
for the two-sphere space $X_n$ from Example 7.

• Correlation dimension, which is a computationally efficient
version of the box-counting dimension, see [?], [15].

• Packing dimension, or rather its computable version as
proposed and explored in [6].

• Distance exponent [16], which is a version of the well-known
Minkowski dimension.

• An algorithm for estimating the intrinsic dimension
based on the Takens theorem from differential geometry [14].

• A non-local approach to intrinsic dimension estimation
based on entropy-theoretic results is proposed in [4]; however,
in the case of manifolds the algorithm will still return the
topological dimension, so the same conclusions apply.

IV. Conclusions 

We have proposed a mathematical formalism for dealing 
with intrinsic dimension functions of datasets (as well as 
more general geometric objects) satisfying two requirements: 
a high intrinsic dimension is indicative of the curse of 
dimensionality, and closeness of two objects to each other 
implies the values of intrinsic dimension are also close. We 
formulate these conditions in a rigorous way, and demon- 
strate that a dimension function with such properties exists. 
We also discuss some of its paradoxical properties, such as, 
for instance, the infinite value of intrinsic dimension of a 
single point. 



This dimension function, interesting as it may be, has
two serious deficiencies. First, from the computational
perspective it appears to be, generally speaking, intractable.
Second, even if known, it need not be usable. A low value
of dimension $\dim_\alpha$ indicates the existence of a 1-Lipschitz
function $f$ on $X$ that is well dissipated (has high variance),
and the corresponding "geodesic flow" gives a principal
curve for $X$. However, it may happen that such an $f$ has
very high complexity (examples are distance functions from
large, complicated subsets of $X$). In applications, one is more
interested in a situation where the features come from a
specified class $\mathcal{F}$ of low-cost functions. (For example, in
the theory of indexing for similarity search, $\mathcal{F}$ may consist of
distance functions to points.) Developing a corresponding
concept of an intrinsic dimension function may solve both of
the above problems, and here [3] can serve as an important
case study.

We also discuss the Gromov distance between spaces with 
metric and measure. This distance per se is computationally 
even harder to estimate. However, notice that any intrinsic 
dimension function gives at least a qualitative estimate on 
the closeness of a dataset X to the one-point space {*}. A 
similar estimate would be much more interesting and useful 
were a singleton replaced by a two-point or, better still, a
$k$-point space (i.e., a singular principal manifold). This is an
obvious next step to explore. Very likely, such estimates are
already implicitly present in the great body of existing work
on principal manifolds.